Data Mining And Data Warehousing
K-Medoids Clustering Algorithm
K-Medoids
The K-Medoids clustering algorithm is similar to K-Means, but instead of using means (averages) to define cluster centers, it uses medoids: actual data points that are most centrally located within a cluster. This makes K-Medoids more robust to noise and outliers than K-Means.
Working of K-Medoids
Initialize:
- Randomly choose k data points as the initial medoids (actual representative data points).
Assign:
- Assign each data point to the nearest medoid (using a distance metric such as Euclidean or Manhattan).
Update:
- For each medoid, try swapping it with a non-medoid point and check whether the total cost (the sum of distances from each point to its nearest medoid) decreases.
- If it does, perform the swap.
Repeat:
- Repeat the Assign and Update steps until the medoids no longer change (or a maximum number of iterations is reached).
Advantages
- Robust to outliers since it uses actual data points.
- Works with arbitrary distance metrics (Euclidean, Manhattan, etc.).
- Handles categorical or mixed-type data better than K-Means, provided a suitable dissimilarity measure is used.
Disadvantages
- Slower than K-Means, especially on large datasets (because of pairwise comparisons).
- Still requires k to be known beforehand.
- Not ideal for very high-dimensional data unless optimized.